
4 research outputs found

A small Griko-Italian speech translation corpus

Authors: Anastasopoulos A.; Besacier L.; Lekakou M.; Villavicencio A.; Zanon Boito M.
Publication venue: International Speech Communication Association
Publication date: 27/07/2018

This paper presents an extension to a very low-resource parallel corpus collected in an endangered language, Griko, making it useful for computational research. The corpus consists of 330 utterances (about 2 hours of speech) that have been transcribed and translated into Italian, with annotations for word-level speech-to-transcription and speech-to-translation alignments. The corpus also includes morphosyntactic tags and word-level glosses. Pseudo-phones were also generated by applying an automatic unit discovery method. We detail how the corpus was collected, cleaned and processed, and we illustrate its use on zero-resource tasks by presenting baseline results for speech-to-translation alignment and unsupervised word discovery. The dataset will be available online, aiming to encourage replicability and diversity in computational language documentation experiments.

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

White Rose Research Online

A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments

Authors: Adda G.; Adda-Decker M.; Benjumea J.; Besacier L.; Cooper-Leavitt J.; Godard P.; Kouarata G-N.; Lamel L.; Maynard H.; Mueller M.; Rialland A.; Stueker S.; Yvon F.; Zanon-Boito M.
Publication date: 15/02/2018

Most speech and language technologies are trained with massive amounts of speech and text data. However, most of the world's languages have neither such resources nor a stable orthography. Systems built under these almost zero-resource conditions are promising not only for speech technology but also for computational language documentation. The goal of computational language documentation is to help field linguists (semi-)automatically analyze and annotate audio recordings of endangered and unwritten languages. Example tasks are automatic phoneme discovery and lexicon discovery from the speech signal. This paper presents a speech corpus collected during a realistic language documentation process. It is made up of 5k speech utterances in Mboshi (Bantu C25) aligned to French text translations. Speech transcriptions are also made available: they correspond to a non-standard graphemic form close to the language's phonology. We present how the data was collected, cleaned and processed, and we illustrate its use through a zero-resource task: spoken term discovery. The dataset is made available to the community for reproducible computational language documentation experiments and their evaluation.

Comment: accepted to LREC 201

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

Proposta de métricas de avaliação da qualidade da informação médica para Sistemas de Recomendação baseados no perfil do usuário

Authors: Astiazara Mauricio Volkweis; Boito Francieli Zanon; dos Anjos Julio Cesar Santos; Lutz João Adolfo Froede; Moraes Tiago Guimarães; Nobre Jeferson Campos; Palazzo M. de Oliveira José; Pereira dos Santos Henrique Dias; Sklar Márcio Muccillo; Weitzel Leila; Yamashita Marcelo Corrêa
Publication venue: Cadernos de Informática
Publication date: 01/01/2010

The Web is a source where people look for information about health care. However, it is open to many kinds of publications and information providers, so the quality of the health information published there is highly variable and dynamic. A lay user seeking information does not always have sufficient knowledge and education to assess and validate the information available. This report addresses a recommender system based on the user's profile and on the quality of the recommended information.

Em Questao

Archives of the Faculty of Veterinary Medicine UFRGS

Lume 5.8

Proposta de métricas de avaliação da qualidade da informação médica para Sistemas de Recomendação baseados no perfil do usuário

Authors: Astiazara Mauricio Volkweis; Boito Francieli Zanon; dos Anjos Julio Cesar Santos; Lutz João Adolfo Froede; Moraes Tiago Guimarães; Nobre Jeferson Campos; Palazzo M. de Oliveira José; Pereira dos Santos Henrique Dias; Sklar Márcio Muccillo; Weitzel Leila; Yamashita Marcelo Corrêa
Publication venue: Cadernos de Informática
Publication date: 30/09/2010

Em Questao